In [5] the authors applied the Abstract Interpretation approach to approximate the probabilistic
semantics of biological systems, modeled specifically using the Chemical Ground Form calculus [3].
The methodology is based on the idea of representing a set of experiments, which differ only in their
initial concentrations, by abstracting the multiplicity of reagents present in a solution, using intervals.
In this paper, we refine the approach in order to address probabilistic termination properties. In more
detail, we introduce a refinement of the abstract LTS semantics and we abstract the probabilistic
semantics using a variant of Interval Markov Chains [34, 13, 19]. The abstract probabilistic model
safely approximates a set of concrete experiments and reports conservative lower and upper bounds
for probabilistic termination.

1 Introduction 

Process calculi, originally designed for modeling distributed and mobile systems, are nowadays one
of the most popular formalisms for the specification of biological systems. In this new application
domain, a great effort has been devoted to adapting traditional models to characterize the molecular
and biochemical aspects of biological systems. Among them [32, 30, 2], stochastic calculi, based on
π-calculus [29, 31], capture the fundamental quantitative aspects (both time and probability) of real-life
applications. The use of a process calculus as a specification language offers a range of well-established
methods for analysis and verification that can now be applied to biological system models. These
techniques can be applied to complex biological systems in order to test hypotheses and to guide future
in vivo experimentation. Stochastic simulators, e.g. [33, 27, 28] for π-calculus, are able to realize
virtual experiments on biological system models, while model checking techniques, recently extended
also to probabilistic and stochastic models, support the validation of temporal properties.

However, the practical application of automatic tools to biological systems has revealed serious
limitations. One specific feature of biological processes is that they are composed of a huge number of
processes with identical behavior, such as thousands of molecules of the same type. Moreover, typically
the exact concentrations of molecules are not known, meaning that the hypotheses have to be tested with
respect to different scenarios. Thus, different experiments have to be realized and the state space of the
models to be analyzed is often very large (even infinite).

Static analysis techniques provide automatic and decidable methods for establishing properties of 
programs, by computing safe approximations of the (run-time) behavior. This approach has been suc- 
cessfully applied to purely qualitative process calculi for distributed and mobile systems, and recently 
also to biologically inspired process calculi, in order to validate safety as well as more complex temporal 
Figure 2: Groupie automata



In [5] we proposed an approximation technique, based on Abstract Interpretation, able to
address probabilistic temporal properties for a simple calculus, the Chemical Ground Form (CGF) [3].
CGF is a fragment of stochastic π-calculus which is rich enough for modeling the dynamics of
biochemical reactions. The abstraction is based on the idea of approximating the information about the
multiplicities of reagents, present in a solution, by means of intervals of integers. The approach
computes an abstract probabilistic semantics for an abstract system, which approximates the probabilistic
semantics, namely the Discrete-Time Markov Chain (DTMC), of any corresponding concrete system.
In particular, the validation of an abstract system gives both lower and upper bounds on the probability of
temporal properties [17], for a set of concrete systems (experiments) differing only in the concentrations
of reagents.

The methodology is illustrated in Fig. 1. As usual, the DTMC of a concrete system is derived from
the LTS semantics, by calculating the probability of each move. The abstraction technique is based
on the definition of a suitable abstract LTS semantics for abstract systems, which supports the derivation
of an abstract probabilistic model, represented by an Interval Markov Chain [34, 13, 19]. In Interval
Markov Chains transitions are labeled with intervals of probabilities, representing the uncertainty about
the concrete probabilities; consequently, the validation of temporal properties reports lower and upper
bounds, rather than exact values, which are obtained by considering the worst-case and best-case
scenarios w.r.t. all non-deterministic choices. The key step of the translation from the abstract
LTS into the Interval Markov Chain is the computation of intervals of probabilities from the
information reported by abstract transition labels. A quite precise approximation is achieved because
the information reported by transition labels is exploited in order to capture also relational
information.

Unfortunately, if one is interested in proving more complex properties of biological systems, such
as probabilistic termination [35], the previously proposed abstraction is not sufficiently powerful. For
probabilistic termination we have to calculate the probability of reaching a terminated state, i.e. a state
where the probability of moving to any other state is zero.

To illustrate probabilistic termination, we consider the "groupies" example proposed by Cardelli in
several tutorials on biochemistry and also reported in [4]. The idea is to study how a set of entities
behaves collectively. The behavior of a single entity is represented by the automaton in Fig. 2: it has
two possible states, X and Y. A single automaton performs no action on its own, while it may interact with
other automata. Two automata in state X are stable since they both offer ?a and !b and no interaction is
possible. The same holds for two automata in state Y. If one automaton is in state X and another is in state
Y then either they can interact on channel a and both move to state X or they can interact on channel b
and both move to state Y. No matter how many automata are initially in state X or in state Y, eventually
the groupies form a single homogeneous population of all X or of all Y. Thus, these systems always
terminate, namely they universally terminate.

The limitation of the abstract LTS semantics defined in [5] is represented by hybrid states, namely
abstract states representing concrete terminated as well as non-terminated states. It should be clear
that, given an abstract state, the most precise correct intervals of probabilities could be derived by
considering the minimum and maximum exact probabilities, for each concrete move, respectively. Thus,
for a hybrid state we would obtain very coarse intervals of probabilities, such as [0, 1], both for
the self-loop and for any other move. This information says that some concrete states may loop forever,
while others may move somewhere else. As a consequence, the lower and upper bounds on the probability of
reaching a terminated state from a hybrid state are typically zero and one, respectively. This is the case,
for example, for the CGF specification of the groupies example discussed above.

In order to better capture probabilistic termination, we propose in this paper a refinement of our
approach, based on a modification of the abstract LTS semantics. In more detail, the abstract transition
relation is refined so that terminated and non-terminated states are properly separated, and consequently
hybrid states are never generated. To this aim, it may be necessary to replace a single abstract
transition, corresponding to a given reaction, by a set of abstract transitions, leading to different abstract
target states. Such distinct abstract transitions model the same reaction but with different concentrations
of reactants. This situation induces a notion of conflict between abstract transitions; indeed, the
corresponding reaction, for each concrete state, is approximated by exactly one of those abstract transitions.
In this context, the labels of transitions precisely identify the interaction and can naturally be exploited
to capture conflicts between abstract transitions.

Once the abstract LTS semantics has been refined, the remaining problem is to generalize the
translation from the abstract LTS to the abstract probabilistic model. In order to maintain the information
about conflict, recorded by abstract transition labels, we adopt a generalization of the original model,
called Labeled Interval Markov Chains (IMC). In IMC the labels allow a more accurate representation of
the set of distributions described by the intervals of probabilities. We show that the technique of [5]
for computing intervals of probabilities from abstract transition labels can be successfully generalized,
by finding a good trade-off between precision and complexity. Finally, the soundness of the
proposed technique is formalized following the approach of [5] (see also [10, 11, 12, 34, 19]), which exploits
suitable approximation orders, both on abstract LTS and on IMC.

The paper is organized as follows. Section 2 introduces the CGF calculus and the LTS semantics,
while Section 3 shows the probabilistic semantics in terms of a DTMC. Section 4 presents the refined
abstract LTS semantics. Section 5 introduces the IMC model and finally, Section 6 presents the effective
method to derive the abstract probabilistic semantics.

2 Chemical Ground Form 

The CGF calculus [3] is a fragment of stochastic π-calculus [29, 27] without communication. Basic
actions are related to rates, which are the parameters of the exponential distribution. We present the
labeled transition system (LTS) semantics of CGF, proposed in [5], which supports more precise
abstractions with respect to the original proposal of [3]. In this approach, processes are labeled, and transitions
record information about the labels of the actions which participate in the move, about their rates, and
about their number of occurrences (in place of the rate of the move as in [3]).

The syntax of (labeled) CGF is defined in Table 1. We consider a set $\mathcal{N}$ (ranged over by a, b, c, ...)
of names, a set $\mathcal{L}$ (ranged over by λ, μ, ...) of labels, and a set $\mathcal{X}$ (ranged over by X, Y, ...) of variables
(representing reagents).

A CGF is defined as a pair (E,P) where E is a species environment and P is a solution. The
environment E is a (finite) list of reagent definitions $X_i = S_i$ for distinct variables $X_i$ and molecules $S_i$.
We assume that the environment E defines all the reagents of solution P. A molecule S may do nothing,
or may change after a delay, or may interact with other reagents. A standard notation is adopted: $\tau_r$
represents a delay at rate r; $a_r$ and $\bar{a}_r$ model, respectively, the input and output on channel a at rate r. A
solution P is a parallel composition of variables, that is a finite list of reagents.

Labels are exploited in order to distinguish the actions which participate in a move. To this aim, we
consider CGF (E,P), where E is well-labeled, meaning that the labels of basic actions are all distinct.
Moreover, given a label $\lambda \in \mathcal{L}$, we use the notation $E.X.\lambda$ to indicate the process $\pi^{\lambda}.P$ provided that
$X = \ldots + \pi^{\lambda}.P + \ldots$ is the definition of X occurring in E. We may also use $\mathcal{L}(E.X)$ for the set of labels
appearing in the definition of X in E.

The semantics is based on the natural representation of solutions as multisets of reagents. A multiset
is a function $M : \mathcal{X} \to \mathbb{N}$. In the following, we use $\mathcal{M}$ for the set of multisets and we use $[[P]]$ for
the multiset of reagents corresponding to a solution P. Moreover, we call M(X) the multiplicity of
reagent X in the multiset M. We may also represent multisets as sets of pairs (m,X), where m is the
multiplicity of reagent X, using a standard notation, where the pairs with multiplicity 0 are omitted.
Over multisets we use the standard operations of sum and difference $\oplus$ and $\ominus$, such that $\forall X \in \mathcal{X}$:
$(M \oplus N)(X) = M(X) + N(X)$ and $(M \ominus N)(X) = M(X) \dot{-} N(X)$, where $n \dot{-} m = n - m$ if $n - m \ge 0$, and $0$ otherwise.
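
As a minimal sketch of the two multiset operations, one can lean on Python's `collections.Counter`, whose `+` and `-` already behave like pointwise sum and truncated difference on natural-number multiplicities. The function names `msum` and `mdiff` are illustrative and not part of the calculus.

```python
from collections import Counter

def msum(m, n):
    """Multiset sum: multiplicities are added pointwise."""
    return m + n

def mdiff(m, n):
    """Multiset difference: truncated subtraction, never below zero.

    Counter subtraction discards entries whose count would be <= 0,
    which matches omitting the pairs with multiplicity 0.
    """
    return m - n

M = Counter({"X": 1, "Y": 2})   # the multiset {(1,X),(2,Y)}
N = Counter({"Y": 3})           # the multiset {(3,Y)}
total = msum(M, N)
rest = mdiff(M, N)
```

Here `rest["Y"]` is 0 even though 2 - 3 is negative, which is the behaviour the truncated subtraction is meant to guarantee.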

The evolution of a solution (w.r.t. a given environment E) is described by a labeled transition relation
of the form

$M \xrightarrow{\Theta,\,\Delta,\,r} M'$

where $r \in \mathbb{R}^{+}$ is a rate, $\Theta \in \widehat{\mathcal{L}} = \mathcal{L} \cup (\mathcal{L} \times \mathcal{L})$, and $\Delta \in \mathcal{Q} = \mathbb{N} \cup (\mathbb{N} \times \mathbb{N})$, such that $arity(\Theta) = arity(\Delta)$.
Here, Θ reports the label (the labels) of the basic action (the basic actions) which participate in the move,
Δ reports consistent information about the multiplicity, and r is the related rate.

The transition relation for multisets is defined by the rules of Table 2 (we are tacitly assuming to reason
w.r.t. a given environment E). Rule (Delay) models the move of a process $\tau^{\lambda}_{r}.Q$ appearing in the
definition of a reagent X. The transition records the label λ together with the multiplicity of X (i.e.
M(X)) as well as the rate r. Rule (Sync) models the synchronization between two complementary
processes $a^{\lambda}_{r}.Q_1$ and $\bar{a}^{\mu}_{r}.Q_2$ appearing in the definitions of reagents X and Y (that may even coincide).
The transition records the labels λ and μ together with the multiplicities of X and Y (i.e. M(X) and
M(Y)) as well as the rate r.

(Delay)  $\dfrac{E.X.\lambda = \tau^{\lambda}_{r}.Q}{M \xrightarrow{\lambda,\; M(X),\; r} (M \ominus (1,X)) \oplus [[Q]]}$

(Sync)  $\dfrac{E.X.\lambda = a^{\lambda}_{r}.Q_1 \qquad E.Y.\mu = \bar{a}^{\mu}_{r}.Q_2}{M \xrightarrow{(\lambda,\mu),\; (M(X),M(Y)),\; r} ((M \ominus (1,X)) \ominus (1,Y)) \oplus [[Q_1]] \oplus [[Q_2]]}$

Table 2: Transition relation

We denote with $LTS((E,M_0)) = (S, \to, M_0, E)$ the LTS, obtained as usual by transitive closure,
starting from the initial state $M_0 \in S$, w.r.t. the environment E. Note that, since environments are well-labeled,
i.e. basic actions have distinct labels, the transitions from a state of the LTS are decorated by distinct
labels too. Moreover, we use $\mathcal{LTS}$ to denote the set of LTS.

In the following, given a transition $t = M \xrightarrow{\Theta,\Delta,r} M'$ we use label(t) to denote its label Θ, and
source(t), target(t) to denote its source state M and target state M', respectively. Similarly, for a set of
transitions TS, we use $label(TS) = \bigcup_{t \in TS} label(t)$. We also use $Ts(M,M') = \{t \mid source(t) = M \text{ and } target(t) = M'\}$
and $Ts(M) = \{t \mid source(t) = M\}$ for describing the transitions from a multiset M to a multiset M',
and all transitions leaving from multiset M, respectively.
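
The construction of the LTS by closure from an initial multiset can be sketched as a simple worklist exploration. The sketch below uses the groupies system from the Introduction and makes several assumptions that are not in the paper: reactions are encoded directly as (label pair, consumed reagents, produced reagents, rate) rather than derived from a CGF environment, label names are spelled out as strings, only moves whose reagents are actually present are generated, and the rate value 1.0 is arbitrary.

```python
from collections import Counter

# Hypothetical reaction-level encoding of the groupies system:
# (label pair, reagents consumed, reagents produced, rate).
REACTIONS = [
    (("lambda", "mu"), ("X", "Y"), ("X", "X"), 1.0),  # X and Y meet on channel a
    (("delta", "eta"), ("X", "Y"), ("Y", "Y"), 1.0),  # X and Y meet on channel b
]

def key(m):
    """Hashable canonical form of a multiset (zero multiplicities omitted)."""
    return tuple(sorted((x, k) for x, k in m.items() if k > 0))

def transitions(m):
    """Yield (theta, delta, rate, target) for each enabled move from m."""
    for theta, (x, y), produced, r in REACTIONS:
        need = Counter([x, y])
        if all(m[z] >= k for z, k in need.items()):
            yield theta, (m[x], m[y]), r, m - need + Counter(produced)

def lts(m0):
    """Explore all multisets reachable from m0; return states and labelled edges."""
    states, edges, todo = {key(m0)}, [], [m0]
    while todo:
        m = todo.pop()
        for theta, delta, r, target in transitions(m):
            edges.append((key(m), theta, delta, r, key(target)))
            if key(target) not in states:
                states.add(key(target))
                todo.append(target)
    return states, edges

states, edges = lts(Counter({"X": 1, "Y": 2}))
```

Each edge carries the label pair, the multiplicities of the two reagents in the source state, and the rate, mirroring the information recorded by the transition labels.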



3 Probabilistic Semantics 

We introduce the probabilistic model of DTMC and we briefly discuss the notion of probabilistic
termination [35]. We also introduce the probabilistic semantics of CGF proposed in [5].

Discrete-Time Markov Chains.

Given a finite or countable set of states $S \subseteq \mathcal{M}$ we denote with

$SDistr(S) = \{\rho \mid \rho : S \to [0,1]\}, \qquad Distr(S) = \{\rho \mid \rho \in SDistr(S) \text{ and } \sum_{M \in S} \rho(M) = 1\}$

the set of (discrete) probability pseudo-distributions and of distributions on S, respectively.

Definition 3.1 (DTMC) A DTMC is a tuple $(S,P,L,M_0)$ where: (i) $S \subseteq \mathcal{M}$ is a finite or countable
set of states and $M_0 \in S$ is the initial state; (ii) $P : S \to Distr(S)$ is the probability transition function; (iii)
$L : S \to (S \to \wp(\widehat{\mathcal{L}}))$ is a labeling function.

In a DTMC state transitions are equipped with probabilities, i.e. P(M)(M') reports the probability of
moving from state M to state M'. In addition, L(M)(M') reports the set of labels corresponding to the
moves from state M to state M'. Notice that we adopt a labeled version of the model in order to simplify
the correspondence with the abstract models; the labels do not modify the probability distributions in the
concrete model. We use $\mathcal{MC}$ for the set of DTMC.

We are interested in probabilistic termination, i.e. in the probability of reaching a state which is
terminated. Given a DTMC $(S,P,L,M_0)$, we say that a state $M \in S$ is terminated iff P(M)(M') = 0, for
each $M' \in S$ with $M' \neq M$.

The probability of reaching a terminated state can be formalized by associating a probability measure to
paths of a DTMC. Let $(S,P,L,M_0)$ be a DTMC. A path π is a non-empty sequence of states of S. We
denote the i-th state in a path π by π[i], and the length of π by |π|. The set of paths (resp. finite paths) over
S is denoted by Paths(S) (resp. FPaths(S)), while C(M) denotes the set of paths starting from the state
$M \in S$. In the following, for $M \in S$ and $\Pi \subseteq C(M)$, $P_M(\Pi)$ stands for the probability of the set of paths
Π (see [20] for the standard definition).

Definition 3.2 (Probabilistic Termination) Let $mc = (S,P,L,M_0)$ be a DTMC. The probability of
reaching a terminated state, from $M \in S$, is $Reach_{mc}(M) = P_M(\{\pi \in C(M) \mid \pi[|\pi|]$ is terminated, and $\forall j,\, 0 < j < |\pi|,\ \pi[j]$ is non-terminated$\})$.

Derivation of the DTMC. 

The derivation of a DTMC from the LTS is based on the computation of the probability of moving from 
M to M' , for any M and M' . To this aim, we extract the rate corresponding to the move from M to M' by 
exploiting the information reported by transition labels. 

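Formally, for a transition $t = M \xrightarrow{\Theta,\Delta,r} M'$ we define the corresponding rate as follows,

$$
rate(t) = \begin{cases}
n \cdot r & \Theta = \lambda,\ \Delta = n, \\
n \cdot (m-1) \cdot r & \Theta = (\lambda,\mu),\ \Delta = (n,m),\ \lambda,\mu \in \mathcal{L}(E.X), \\
n \cdot m \cdot r & \Theta = (\lambda,\mu),\ \Delta = (n,m),\ \lambda \in \mathcal{L}(E.X),\ \mu \in \mathcal{L}(E.Y),\ X \neq Y.
\end{cases}
$$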



As usual, for computing rate(t) it is necessary to take into account the number of distinct transitions
t that may occur in the multiset M. Thus, the rate r of the basic action (actions) related to Θ is multiplied
by the number of distinct combinations appearing in M (by exploiting the information recorded by Δ).
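
The three cases of rate(t) can be sketched as a small function. The signature is an assumption made for illustration: the label Θ is a string for a delay and a pair of strings for a synchronization, and the boolean `same_reagent` stands in for the side condition that both labels belong to the definition of the same reagent X.

```python
def rate(theta, delta, r, same_reagent=False):
    """Rate of a transition labelled (theta, delta, r).

    theta: a single label (delay) or a pair of labels (synchronization)
    delta: a multiplicity n (delay) or a pair (n, m) (synchronization)
    same_reagent: True when both labels occur in the definition of one reagent
    """
    if isinstance(theta, tuple):
        n, m = delta
        if same_reagent:
            # Two distinct copies of the same reagent must be chosen.
            return n * (m - 1) * r
        return n * m * r
    return delta * r

delay_rate = rate("lambda", 3, 2.0)
sync_rate = rate(("lambda", "mu"), (1, 2), 1.0)
homo_rate = rate(("lambda", "mu"), (3, 3), 1.0, same_reagent=True)
```

For instance, a synchronization between one X and two Y at rate 1.0 has rate 1 · 2 · 1.0, while three copies of a single reagent reacting with itself give 3 · 2 · 1.0.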

Then, we introduce functions $R : S \times S \to \mathbb{R}^{\ge 0}$ and $E : S \to \mathbb{R}^{\ge 0}$, such that

$R(M,M') = \sum_{t \in Ts(M,M')} rate(t) \qquad E(M) = \sum_{M' \in S} R(M,M').$

Intuitively, R(M,M') reports the rate corresponding to the move from M to M', while E(M) is the
exit rate. Finally, the probability of moving from M to M' is computed from R(M,M') and from the exit
rate E(M), in a standard way.

Definition 3.3 We define a probabilistic translation function $H : \mathcal{LTS} \to \mathcal{MC}$ such that
$H((S,\to,M_0,E)) = (S,P,L,M_0)$, where

1. $P : S \to Distr(S)$ is the probability transition function, such that for each $M,M' \in S$:

a) if $E(M) > 0$, then $P(M)(M') = R(M,M')/E(M)$;

b) if $E(M) = 0$, then $P(M)(M) = 1$ and $P(M)(M') = 0$ for $M' \neq M$.

2. $L : S \to (S \to \wp(\widehat{\mathcal{L}}))$ is a labeling function, such that, for each $M,M' \in S$,
$L(M)(M') = label(\{t \in Ts(M,M') \mid rate(t) > 0\})$.

Due to the particular labeling of the LTS semantics, the DTMC modeling the probabilistic
semantics of a CGF process also satisfies the property that all transitions leaving from a state are decorated
by distinct labels.

Example 3.4 The groupies example, discussed in the Introduction, can be formalized by the following
environment, $E ::= X = a^{\lambda}_{r}.X + \bar{b}^{\delta}_{r}.Y,\ \ Y = \bar{a}^{\mu}_{r}.X + b^{\eta}_{r}.Y$.

Reagents X and Y may interact together in two possible ways: either along channel a or along
channel b; both reactions have the same rate r. The former case models a duplication of X, while the
latter case models a duplication of Y.
Figure 3: The LTS and the corresponding DTMC

Fig. 3 illustrates the LTS and the corresponding DTMC, for the CGF $(E,M_0)$, where
$M_0 = \{(1,X),(2,Y)\}$  $M_1 = \{(2,X),(1,Y)\}$  $M_2 = \{(3,X)\}$  $M_3 = \{(3,Y)\}$

The LTS reports for each state, except for states $M_2$ and $M_3$, two transitions: label (λ,μ) models
the duplication of X, while label (δ,η) models the duplication of Y. The transitions also record the
multiplicities of reagents X and Y and the corresponding rate. As a consequence, in the DTMC, states
$M_2$ and $M_3$ are terminated. By contrast, the states $M_0$ and $M_1$ have two different moves with the same
probability. By calculating the probability of reaching a terminated state from $M_0$ we obtain exactly 1.
Indeed, the probability of being stuck in the loop $M_0$-$M_1$ is zero. □
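
The numbers in this example can be checked with a short sketch that normalizes rates by the exit rate and then approximates the probability of reaching a terminated state by fixed-point iteration. The rate table is written out by hand from the four states above using rate = n · m · r; the value r = 1.0, the state names as strings, and the choice of plain iteration instead of solving the linear system exactly are assumptions of this sketch.

```python
r = 1.0  # illustrative value; only the ratios between rates matter

# R[M][M'] = total rate of the moves from M to M' (here n * m * r).
R = {
    "M0": {"M1": 1 * 2 * r, "M3": 1 * 2 * r},
    "M1": {"M2": 2 * 1 * r, "M0": 2 * 1 * r},
    "M2": {},
    "M3": {},
}

def to_dtmc(R):
    """Divide each rate by the exit rate; states with exit rate 0 get a self-loop."""
    P = {}
    for m, out in R.items():
        exit_rate = sum(out.values())
        if exit_rate > 0:
            P[m] = {t: v / exit_rate for t, v in out.items()}
        else:
            P[m] = {m: 1.0}
    return P

def terminated(P, m):
    """A state is terminated iff it moves to no other state with positive probability."""
    return all(p == 0 or t == m for t, p in P[m].items())

def reach_terminated(P, steps=200):
    """Approximate, from below, the probability of reaching a terminated state."""
    x = {m: 1.0 if terminated(P, m) else 0.0 for m in P}
    for _ in range(steps):
        x = {m: 1.0 if terminated(P, m)
             else sum(p * x[t] for t, p in P[m].items())
             for m in P}
    return x

P = to_dtmc(R)
reach = reach_terminated(P)
```

The two equations behind the iteration are $x_0 = \frac{1}{2} + \frac{1}{2}x_1$ and $x_1 = \frac{1}{2} + \frac{1}{2}x_0$, whose only solution is $x_0 = x_1 = 1$.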

4 Abstract LTS 

The abstract LTS semantics uses the same abstraction of multisets as [5], based on the approximation of
the multiplicity of reagents by means of intervals of integers. The abstract transition relation, instead,
is refined, and the related notions, needed for expressing soundness, are adapted accordingly.

Abstraction of states. 

We adopt intervals of integers, $\mathcal{I} = \{[m,n] \mid m \in \mathbb{N},\ n \in \mathbb{N} \cup \{\infty\} \wedge m \le n\}$. Over intervals we consider
the standard order $\sqsubseteq_I$, such that $I \sqsubseteq_I J$ iff $\min(I), \max(I) \in J$. Moreover, we use $\sqcup_I$ for the corresponding
l.u.b.

The abstract states are defined by replacing multiplicities with intervals of multiplicities. Therefore,
an abstract state is a function $M^\circ : \mathcal{X} \to \mathcal{I}$. We also use $\mathcal{M}^\circ$ for the set of abstract states.

Given a multiset M, there exists an abstract multiset $M^\circ$ which is its most precise
approximation. Indeed, each multiplicity, such as n, can be replaced with the exact interval [n,n]; for simplicity,
we may even use n as a shorthand for [n,n]. In the following, α(M) stands for the best abstraction of a
multiset M. Moreover, we use $M^\circ[I/X]$ for denoting the abstract state where the abstract multiplicity of
reagent X is replaced by the interval $I \in \mathcal{I}$. We adopt abstract operations of sum and difference, such
that $\forall X \in \mathcal{X}$,

$(M^\circ \oplus^\circ N^\circ)(X) = M^\circ(X) + N^\circ(X), \qquad I + J = [\min(I)+\min(J),\ \max(I)+\max(J)]$
$(M^\circ \ominus^\circ N^\circ)(X) = M^\circ(X) - N^\circ(X), \qquad I - J = [\min(I) \dot{-} \max(J),\ \max(I) \dot{-} \min(J)]$
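
The interval operations above can be sketched directly on pairs (lower, upper). This is an illustration under two assumptions: an unbounded upper end is represented by `math.inf`, and the difference uses truncated subtraction, as in the concrete case, so that the result is again an interval of naturals. The function names are not taken from the paper.

```python
import math

def i_add(i, j):
    """Interval sum: add lower bounds and upper bounds."""
    return (i[0] + j[0], i[1] + j[1])

def i_sub(i, j):
    """Interval difference with truncation at zero."""
    return (max(i[0] - j[1], 0), max(i[1] - j[0], 0))

def i_leq(i, j):
    """Order on intervals: i is included in j."""
    return j[0] <= i[0] and i[1] <= j[1]

def i_lub(i, j):
    """Least upper bound: the smallest interval containing both."""
    return (min(i[0], j[0]), max(i[1], j[1]))

# One Y is consumed from an abstract multiplicity [1,2]:
after = i_sub((1, 2), (1, 1))
# Two X are produced and one consumed, starting from [1,2]:
grown = i_add(i_sub((1, 2), (1, 1)), (2, 2))
```

The value of `after` is the interval [0,1], which is exactly the kind of interval that gives rise to hybrid states, since it contains both zero and non-zero multiplicities.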

It is immediate to define the following approximation order over abstract states.

Definition 4.1 (Order on States) Let $M^\circ_1, M^\circ_2 \in \mathcal{M}^\circ$; we say that $M^\circ_1 \sqsubseteq^\circ M^\circ_2$ iff, for each reagent $X \in \mathcal{X}$,
$M^\circ_1(X) \sqsubseteq_I M^\circ_2(X)$.

The relation between multisets and abstract states is formalized as a Galois connection [8]. The
abstraction function $\alpha : \wp(\mathcal{M}) \to \mathcal{M}^\circ$ reports the best approximation for each set of multisets S, namely the
l.u.b. (denoted by $\sqcup^\circ$) of the best abstraction of each $M \in S$. Its counterpart is the concretization function
$\gamma : \mathcal{M}^\circ \to \wp(\mathcal{M})$, which reports the set of multisets represented by an abstract state. We refer the reader
to [5] for the properties of the functions (α, γ).

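Definition 4.2 We define $\alpha : \wp(\mathcal{M}) \to \mathcal{M}^\circ$ and $\gamma : \mathcal{M}^\circ \to \wp(\mathcal{M})$ such that, for each $S \in \wp(\mathcal{M})$ and
$M^\circ \in \mathcal{M}^\circ$: (i) $\alpha(S) = \bigsqcup^\circ_{M \in S} \alpha(M)$; (ii) $\gamma(M^\circ) = \{M' \mid \alpha(M') \sqsubseteq^\circ M^\circ\}$.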
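
A minimal sketch of the abstraction of a non-empty set of multisets, and of membership in the concretization, may help to see why the abstraction is non-relational. Multisets are plain dictionaries here, abstract states map reagents to (lower, upper) pairs, and all function names are illustrative assumptions.

```python
def alpha_one(m, reagents):
    """Best abstraction of a single multiset: every multiplicity n becomes [n, n]."""
    return {x: (m.get(x, 0), m.get(x, 0)) for x in reagents}

def lub(a, b):
    """Pointwise least upper bound of two abstract states."""
    return {x: (min(a[x][0], b[x][0]), max(a[x][1], b[x][1])) for x in a}

def alpha(multisets, reagents):
    """Abstraction of a non-empty collection of multisets."""
    states = [alpha_one(m, reagents) for m in multisets]
    result = states[0]
    for s in states[1:]:
        result = lub(result, s)
    return result

def in_gamma(m, abstract):
    """True iff the multiset m is represented by the abstract state."""
    return all(lo <= m.get(x, 0) <= hi for x, (lo, hi) in abstract.items())

experiments = [{"X": 1, "Y": 2}, {"X": 2, "Y": 1}]
abstract = alpha(experiments, ["X", "Y"])
```

Abstracting the two experiments yields the state with X and Y both in [1,2], and its concretization also contains the multiset with two X and two Y, which was not among the inputs.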
Abstract transitions. 

The semantics of [5] uses abstract transitions of the form $M^\circ_1 \xrightarrow{\Theta,\,\Delta^\circ,\,r} M^\circ_2$ where $\Theta \in \widehat{\mathcal{L}}$ and
$\Delta^\circ \in \mathcal{Q}^\circ = \mathcal{I} \cup (\mathcal{I} \times \mathcal{I})$, with $arity(\Theta) = arity(\Delta^\circ)$. As in the concrete case, Θ reports the label (the
labels) of the basic action (actions), $\Delta^\circ$ reports consistent information about the possible multiplicities,
while r is the rate.

In the proposed approach, such a transition is intended to approximate all the concrete moves
corresponding to label Θ, for each multiset $M_1$ approximated by the abstract state $M^\circ_1$. This means that
there exists a concrete transition $M_1 \xrightarrow{\Theta,\,\Delta,\,r} M_2$, where the multiplicity (multiplicities) Δ is included in the
interval (intervals) $\Delta^\circ$, and $M_2$ is approximated by the abstract state $M^\circ_2$.

Let us consider the environment E discussed in Example 3.4 and a very simple abstract state
such as $M^\circ_0 = \{([1,2],X),([1,2],Y)\}$. The abstract state $M^\circ_0$ describes a set of experiments; thus, the
abstract semantics has to model the system described by E w.r.t. different initial concentrations. For
approximating the duplication of X, i.e. the synchronization between X and Y along channel a, we would
obtain

$M^\circ_0 \xrightarrow{(\lambda,\mu),\ ([1,2],[1,2]),\ r} M'^\circ_0$, with $M'^\circ_0 = \{([2,3],X),([0,1],Y)\}$.

In this way, however, a hybrid state $M'^\circ_0$ is introduced. Actually, $M'^\circ_0$ represents terminated multisets,
where the concentration of reagent Y is zero, as well as non-terminated multisets, where reagent Y is still
available.

It should be clear that the moves corresponding to (λ,μ) could be better approximated by adopting
two different abstract transitions, (a) and (b), leading from $M^\circ_0$ to $M^\circ_1$ and to $M^\circ_3$ respectively,
where $M^\circ_1 = \{([2,3],X),([1,1],Y)\}$ and $M^\circ_3 = \{([2,3],X),([0,0],Y)\}$. In this representation the labels
capture relevant information because they express a conflict. Actually, each multiset represented by $M^\circ_0$
realizes a move corresponding to (λ,μ) which is abstracted either by transition (a) or by transition (b).

Table 3 presents the refined abstract transition rules (as usual, w.r.t. a given environment E). The
rules are derived from the concrete ones, by replacing multiplicities with intervals of multiplicities. The
following operators are applied both to the target state and to the intervals appearing in the transition
labels, in order to properly split intervals such as [0,n].

For $X \in \mathcal{X}$, we define $\aleph(X) = \{(X = 0),(X > 0)\}$. Then, given an abstract state $M^\circ \in \mathcal{M}^\circ$ and
$\sharp \in \aleph(X)$ we define

$$
\nabla^{\sharp}(M^\circ) = \begin{cases}
M^\circ[[0,0]/X] & \text{if } \sharp = (X = 0),\ M^\circ(X) = [0,n],\ n > 0, \\
M^\circ[[1,n]/X] & \text{if } \sharp = (X > 0),\ M^\circ(X) = [0,n],\ n > 0, \\
M^\circ & \text{otherwise.}
\end{cases}
$$
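
The effect of the splitting operator on abstract states can be sketched as follows. The sketch assumes that an interval is split only when it has the form [0,n] with n > 0, and it encodes the two cases (X = 0) and (X > 0) by the strings "zero" and "positive"; both the encoding and the function name are illustrative.

```python
def split(state, x, case):
    """Refine the interval of reagent x in an abstract state.

    state: dict mapping reagents to (lower, upper) intervals
    case:  "zero" for (x = 0), "positive" for (x > 0)
    """
    lo, hi = state[x]
    if lo == 0 and hi > 0:
        refined = dict(state)
        refined[x] = (0, 0) if case == "zero" else (1, hi)
        return refined
    return state  # nothing to split

hybrid = {"X": (2, 3), "Y": (0, 1)}
without_y = split(hybrid, "Y", "zero")
with_y = split(hybrid, "Y", "positive")
```

Applied to the hybrid state with X in [2,3] and Y in [0,1], the two cases produce one state where Y is exhausted and one where Y is still available, so terminated and non-terminated multisets are no longer mixed.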
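(Delay-a)  $\dfrac{E.X.\lambda = \tau^{\lambda}_{r}.Q \qquad \sharp \in \aleph(X)}{M^\circ \xrightarrow{\lambda,\ \nabla^{\sharp}(M^\circ(X)),\ r} \nabla^{\sharp}((M^\circ \ominus^\circ \{(1,X)\}) \oplus^\circ \alpha([[Q]]))}$

(Sync-a)  $\dfrac{E.X.\lambda = a^{\lambda}_{r}.Q_1 \qquad E.Y.\mu = \bar{a}^{\mu}_{r}.Q_2 \qquad \sharp_1 \in \aleph(X) \qquad \sharp_2 \in \aleph(Y)}{M^\circ \xrightarrow{(\lambda,\mu),\ (\nabla^{\sharp_1}(M^\circ(X)),\, \nabla^{\sharp_2}(M^\circ(Y))),\ r} \nabla^{\sharp_1,\sharp_2}(((M^\circ \ominus^\circ \{(1,X)\}) \ominus^\circ \{(1,Y)\}) \oplus^\circ \alpha([[Q_1]]) \oplus^\circ \alpha([[Q_2]]))}$

Table 3: Abstract transition relation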

With an abuse of notation, we may write $\nabla^{\sharp_1,\sharp_2}(M^\circ)$ in place of $\nabla^{\sharp_1}(\nabla^{\sharp_2}(M^\circ))$. Similarly, for an
interval $I = [n,m] \in \mathcal{I}$ and $\sharp \in \aleph(X)$,

if (t = (X = 0) J n<l, 
if (J = (X >0),n < l,m >2, 
otherwise. 

In the following we use $\mathcal{LTS}^\circ$ to denote the set of abstract LTS. We also assume that all notations
defined for LTS are adapted in the obvious way. Hence, we write $LTS^\circ((E,M^\circ_0)) = (S^\circ, \to^\circ, M^\circ_0, E)$ for
the abstract LTS, obtained for the initial abstract state $M^\circ_0$ by transitive closure.

For the sake of simplicity we have presented an approximation where the number of states may be 
infinite. Further approximations can be easily derived by means of widening operators (see 0). 

Soundness. 

In the style of [5], we introduce an approximation order $\sqsubseteq^\circ_{lts}$ over abstract LTS. In this way, we can say
that an abstract LTS $lts^\circ$ is a sound approximation of an LTS $lts$ provided that $\alpha_{lts}(lts) \sqsubseteq^\circ_{lts} lts^\circ$; as usual,
$\alpha_{lts}(lts)$ is the best approximation of $lts$.

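Definition 4.3 (Best Abstraction of LTS) We define $\alpha_{lts} : \mathcal{LTS} \to \mathcal{LTS}^\circ$, such that $\alpha_{lts}((S,\to,M_0,E))
= (\{\alpha(M)\}_{M \in S},\ \alpha(\to),\ \alpha(M_0),\ E)$ where $\alpha(\to) = \{\alpha(M) \xrightarrow{\Theta,\Delta^\circ,r} \alpha(M_1) \mid M \xrightarrow{\Theta,\Delta,r} M_1 \in\ \to\}$ and $\Delta^\circ$ is the
best abstraction of Δ, derived component-wise.

In the following, we assume the order $\sqsubseteq_I$ over intervals to be extended to pairs of intervals; $\Delta^\circ_1 \sqsubseteq_I \Delta^\circ_2$ is
defined component-wise.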

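Definition 4.4 (Order on abstract LTS) Let $lts^\circ_i = (S^\circ_i, \to^\circ_i, M^\circ_{0,i}, E)$ with $i \in \{1,2\}$ be abstract LTS.
For $M^\circ_1 \in S^\circ_1$, $M^\circ_2 \in S^\circ_2$, we say that $M^\circ_1 \preceq_{lts} M^\circ_2$ iff there exists a relation $R \subseteq S^\circ_1 \times S^\circ_2$ such that if $M^\circ_1 R M^\circ_2$
then: (i) $M^\circ_1 \sqsubseteq^\circ M^\circ_2$; and (ii) there exists a surjective function $H_t : Ts(M^\circ_1) \to Ts(M^\circ_2)$ such that, for each
$t^\circ_1 \in Ts(M^\circ_1)$, $t^\circ_1 = M^\circ_1 \xrightarrow{\Theta,\Delta^\circ_1,r} N^\circ_1$, $H_t(t^\circ_1) = t^\circ_2$ where $t^\circ_2 = M^\circ_2 \xrightarrow{\Theta,\Delta^\circ_2,r} N^\circ_2$, $\Delta^\circ_1 \sqsubseteq_I \Delta^\circ_2$ and $N^\circ_1 R N^\circ_2$. We say
that $lts^\circ_1 \sqsubseteq^\circ_{lts} lts^\circ_2$ iff $M^\circ_{0,1} \preceq_{lts} M^\circ_{0,2}$.

The approximation order for abstract LTS is based on a simulation between abstract states. In more
detail, we say that $M^\circ_2$ simulates $M^\circ_1$ ($M^\circ_1 \preceq_{lts} M^\circ_2$) whenever $M^\circ_2$ approximates $M^\circ_1$, and there exists a
surjective function $H_t : Ts(M^\circ_1) \to Ts(M^\circ_2)$ between the transitions of $M^\circ_1$ and $M^\circ_2$. In particular, each
move $M^\circ_1 \xrightarrow{\Theta,\Delta^\circ_1,r} N^\circ_1$ has to be matched by a move $M^\circ_2 \xrightarrow{\Theta,\Delta^\circ_2,r} N^\circ_2$, related to the same label Θ, and such
that $\Delta^\circ_1 \sqsubseteq_I \Delta^\circ_2$, showing that the multiplicities are properly approximated.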
Figure 4: The abstract LTS

The following theorem shows that the abstract LTS computed for an abstract state $M^\circ$ is a sound
approximation of the LTS, for any M represented by $M^\circ$.

Theorem 4.5 (Soundness) Let E be an environment and $M^\circ \in \mathcal{M}^\circ$. For each $M' \in \gamma(M^\circ)$, we have
$\alpha_{lts}(LTS((E,M'))) \sqsubseteq^\circ_{lts} LTS^\circ((E,M^\circ))$.

Splitting hybrid states by means of the $\nabla$ operator, in order to distinguish terminated and
non-terminated states, may in general drastically increase the number of abstract states. For example, the
abstract LTS starting from the state $M^\circ_0 = \{([1,2],X),([1,2],Y)\}$ w.r.t. the environment E of Example
3.4 would have 14 abstract states.

It is worth noting, however, that for modeling probabilistic termination we do not need to be too fine
in distinguishing different non-terminated states. For this reason we can apply the following widening
operator to each abstract transition step: we approximate the new abstract state $M^\circ$, resulting from the
application of the transition relation of Table 3, with an abstract state $M^\circ_1$, if $M^\circ \sqsubseteq^\circ M^\circ_1$ and $M^\circ_1$ was already
generated in a previous derivation step. This reduces the number of newly generated abstract states, as
shown in the next example.

For these reasons, in the following we always assume the application of this widening
operator.

Example 4.6 Fig. 4 shows the complete abstract LTS for the abstract state $M^\circ_0 = \{([1,2],X),([1,2],Y)\}$
w.r.t. the environment E of Example 3.4, where

$M^\circ_1 = \{([2,3],X),([1,1],Y)\}$  $M^\circ_2 = \{([3,4],X),([0,0],Y)\}$  $M^\circ_3 = \{([2,3],X),([0,0],Y)\}$
$M^\circ_4 = \{([0,0],X),([2,3],Y)\}$  $M^\circ_5 = \{([1,1],X),([2,3],Y)\}$  $M^\circ_6 = \{([0,0],X),([3,4],Y)\}$

5 Abstract Probabilistic Semantics 

In standard Interval Markov Chains [34, 13] transitions report intervals of probabilities, representing
lower and upper bounds on the concrete probabilities, i.e. a set of possible distributions. Unfortunately,
this information is not adequate for our abstraction. Let us consider again the system discussed in
Examples 3.4 and 4.6. As illustrated in the LTS of Fig. 4, the reachable states from $M^\circ_0$ are $M^\circ_1$, $M^\circ_3$,
$M^\circ_4$ and $M^\circ_5$ (see also Fig. 5(c)).
Figure 5: The interval of probabilities and of multiplicities for $M^\circ_0$'s transitions.

In order to reason on the interval of probabilities we could safely assign to each transition leaving
from $M^\circ_0$, it is useful to examine the set of concrete probability distributions, for each multiset $M_0$
represented by $M^\circ_0$. The DTMC corresponding to one of the experiments represented by $M^\circ_0$ is described
in Fig. 3; the other cases show analogous behaviors. Actually, for each $M_0$, there are two possible
synchronizations between reagents X and Y: one corresponding to the duplication of X and the other one
corresponding to the duplication of Y. These two alternative moves always have the same probability.

Moreover, each solution $M_0$, when there is a duplication of X, evolves into a solution which is
represented either by $M^\circ_1$ (where reagent Y is still available) or by $M^\circ_3$ (where the concentration of Y
is 0). The same holds for the duplication of Y and the abstract states $M^\circ_5$ and $M^\circ_4$. Thus, the abstract
distributions representing the concrete distributions are:

$\rho_1(M^\circ_3) = 1/2,\ \rho_1(M^\circ_1) = 0,\ \rho_1(M^\circ_5) = 1/2,\ \rho_1(M^\circ_4) = 0,$
$\rho_2(M^\circ_3) = 1/2,\ \rho_2(M^\circ_1) = 0,\ \rho_2(M^\circ_5) = 0,\ \rho_2(M^\circ_4) = 1/2,$
$\rho_3(M^\circ_3) = 0,\ \rho_3(M^\circ_1) = 1/2,\ \rho_3(M^\circ_5) = 1/2,\ \rho_3(M^\circ_4) = 0,$
$\rho_4(M^\circ_3) = 0,\ \rho_4(M^\circ_1) = 1/2,\ \rho_4(M^\circ_5) = 0,\ \rho_4(M^\circ_4) = 1/2.$

It should be clear that the most precise intervals of probabilities representing the previous
distributions could be obtained by considering the minimum and maximum probability, for each move. The
intervals we would obtain in this way are illustrated in Fig. 5(a). This representation introduces a clear
loss of information. For instance, the intervals include a distribution such as $\rho(M^\circ_1) = 1/2$, $\rho(M^\circ_3) =
1/2$, $\rho(M^\circ_4) = 0$, $\rho(M^\circ_5) = 0$, which does not correspond to any concrete behavior. Actually, states $M^\circ_1$
and $M^\circ_3$ are in conflict.
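
The loss of information can be reproduced with a few lines that take the four distributions listed above, compute the minimum and maximum probability per target state, and check that the spurious distribution fits inside the resulting intervals. State names are written as plain strings and exact fractions are used to avoid rounding; these choices, like the function names, are assumptions of the sketch.

```python
from fractions import Fraction

half = Fraction(1, 2)
STATES = ["M1", "M3", "M4", "M5"]

# The four abstract distributions rho_1 .. rho_4.
RHOS = [
    {"M3": half, "M1": 0, "M5": half, "M4": 0},
    {"M3": half, "M1": 0, "M5": 0, "M4": half},
    {"M3": 0, "M1": half, "M5": half, "M4": 0},
    {"M3": 0, "M1": half, "M5": 0, "M4": half},
]

def bounds(rhos):
    """Tightest per-state interval containing every listed distribution."""
    return {s: (min(r[s] for r in rhos), max(r[s] for r in rhos)) for s in STATES}

def within(p, b):
    """True iff p is a distribution whose entries all lie in the given intervals."""
    return sum(p.values()) == 1 and all(b[s][0] <= p[s] <= b[s][1] for s in STATES)

intervals = bounds(RHOS)
spurious = {"M1": half, "M3": half, "M4": 0, "M5": 0}
```

Every state receives the interval [0, 1/2], and the spurious distribution satisfies all four intervals although it is none of the four listed distributions.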

Since labels are exploited in the abstract LTS in order to represent conflict, we introduce a
generalization of the original model, called Labeled Interval Markov Chains (IMC). By means of labels, the
model allows a more accurate representation of the set of distributions described by intervals of
probability.

Labeled Interval Markov Chains. 

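Definition 5.1 (IMC) An IMC is a tuple $(S^\circ, P^-, P^+, L, M_0^\circ)$ where

1. $S^\circ \subseteq \mathcal{M}^\circ$ is a countable set of abstract states and $M_0^\circ \in S^\circ$ is the initial state;

2. $P^-, P^+ : S^\circ \to SDistr(S^\circ)$ are the lower and upper bounds on probabilities, such that for each
$M_1^\circ, M_2^\circ \in S^\circ$, $P^-(M_1^\circ)(M_2^\circ) \le P^+(M_1^\circ)(M_2^\circ)$;

3. $L : S^\circ \to (S^\circ \to \wp(\widehat{\mathcal{L}}))$ is a labeling function.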
In the following we use $\mathcal{MC}^\circ$ to denote the set of IMC. As in the standard model,
$P^-(M_1^\circ)(M_2^\circ)$ and $P^+(M_1^\circ)(M_2^\circ)$ define the lower and upper bound, respectively,
for the move from $M_1^\circ$ to $M_2^\circ$. In addition, $L(M_1^\circ)(M_2^\circ)$ reports the set of labels
corresponding to the move. Intervals represent sets of admissible distributions; the notion of admissible
distribution has to be slightly adapted in order to handle the conflict between (sets of) labels.

Definition 5.2 (Conflict of Labels) Let $\alpha, \beta \in \wp(\widehat{\mathcal{L}})$ be sets of labels. We say that $\alpha$ is in
conflict with $\beta$ iff there exists $\vartheta \in \widehat{\mathcal{L}}$ such that $\alpha = \{\vartheta\} = \beta$.

The notion of conflict between labels induces a corresponding notion of conflict between states. Let
$(S^\circ, P^-, P^+, L, M_0^\circ)$ be an IMC and $M^\circ \in S^\circ$. We say that $NS^\circ \subseteq S^\circ$ is a set of
no-conflict states w.r.t. $M^\circ$ iff it is maximal and, for each $M_1^\circ, M_2^\circ \in NS^\circ$, there is no
conflict between $L(M^\circ)(M_1^\circ)$ and $L(M^\circ)(M_2^\circ)$.

Definition 5.3 (Admissible Distribution) Let $mc^\circ = (S^\circ, P^-, P^+, L, M_0^\circ)$ be an IMC and let $M^\circ \in S^\circ$.

We say that a distribution $\rho \in Distr(S^\circ)$ is admissible for $M^\circ$ iff there exists a set of no-conflict
states $NS^\circ$ such that, for each $M_1^\circ \in S^\circ$: if $M_1^\circ \in NS^\circ$, then
$P^-(M^\circ)(M_1^\circ) \le \rho(M_1^\circ) \le P^+(M^\circ)(M_1^\circ)$; $\rho(M_1^\circ) = 0$, otherwise.
We use $ADistr_{mc^\circ}(M^\circ)$ for the set of admissible distributions for $M^\circ$.

Intuitively, an admissible distribution $\rho$ corresponds to a set of no-conflict states $NS^\circ$, and reports
a value included in the interval for each state of $NS^\circ$, and zero otherwise. As an example, the IMC
illustrated in Fig. 5(b) has four sets of no-conflict states w.r.t. $M_0^\circ$: (1) $\{M_3^\circ, M_4^\circ\}$;
(2) $\{M_3^\circ, M_5^\circ\}$; (3) $\{M_1^\circ, M_4^\circ\}$; and (4) $\{M_1^\circ, M_5^\circ\}$. As a consequence,
the admissible distributions corresponding to (1)-(4) are exactly the distributions $\rho_1$-$\rho_4$
discussed at the beginning of the Section. This shows that the IMC of Fig. 5(b) is a sound (and very
precise) approximation of the probabilistic semantics, for each multiset represented by $M_0^\circ$.
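
Definitions 5.2 and 5.3 can be illustrated on this example with a small Python sketch. It is a reading of the definitions under stated assumptions, not the authors' implementation: the label names `dupX` and `dupY` are invented, conflict is checked only between distinct targets, and only the targets that carry a label are listed (any other state would receive probability 0).

```python
from fractions import Fraction
from itertools import combinations

HALF = Fraction(1, 2)

# Moves leaving the abstract state M0: target -> (lower, upper, labels).
# "dupX" / "dupY" are made-up label names for the two synchronizations.
ROW = {
    "M1": (HALF, HALF, frozenset({"dupX"})),
    "M3": (HALF, HALF, frozenset({"dupX"})),
    "M4": (HALF, HALF, frozenset({"dupY"})),
    "M5": (HALF, HALF, frozenset({"dupY"})),
}


def in_conflict(a, b):
    """Definition 5.2: both label sets are the same singleton."""
    return len(a) == 1 and a == b


def no_conflict_sets(row):
    """Maximal sets of targets whose labels are pairwise not in conflict."""
    targets = sorted(row)
    maximal = []
    for size in range(len(targets), 0, -1):       # largest candidates first
        for cand in combinations(targets, size):
            clash = any(in_conflict(row[s][2], row[t][2])
                        for s, t in combinations(cand, 2))
            if not clash and not any(set(cand) < m for m in maximal):
                maximal.append(set(cand))
    return maximal


def is_admissible(rho, row):
    """Definition 5.3, restricted to the targets listed in row."""
    if sum(rho.values()) != 1:
        return False
    for ns in no_conflict_sets(row):
        if all(row[s][0] <= rho.get(s, 0) <= row[s][1] if s in ns
               else rho.get(s, 0) == 0 for s in row):
            return True
    return False


for ns in no_conflict_sets(ROW):
    print(sorted(ns))
print(is_admissible({"M3": HALF, "M5": HALF}, ROW))
print(is_admissible({"M1": HALF, "M3": HALF}, ROW))
```

The enumeration is exponential in the number of targets, which is acceptable for a four-target example but is not meant as an efficient procedure.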

Once admissible distributions are defined, the concept of scheduler follows the same guidelines of (51 .
The notions of path and cylinder for IMC are analogous to those presented for DTMC.

Definition 5.4 (Scheduler) Let $mc^\circ = (S^\circ, P^-, P^+, L, M_0^\circ)$ be an IMC. A scheduler is a function
$\eta : FPaths(S^\circ) \to Distr(S^\circ)$ such that $\eta(\pi^\circ) \in ADistr_{mc^\circ}(\pi^\circ[|\pi^\circ|])$ for any
abstract path $\pi^\circ \in FPaths(S^\circ)$. We use $Adv(mc^\circ)$ to denote the set of schedulers.

Given a scheduler, a probability space over paths can be defined analogously to the DTMC case. In the
following, $P^{\eta}_{M^\circ}$ stands for the probability starting from $M^\circ$ w.r.t. the scheduler $\eta \in Adv(mc^\circ)$.

An IMC gives both under- and over-approximations of the probability of reachability properties,
which can be computed by considering the worst and best probabilities w.r.t. all the schedulers. For
approximating probabilistic termination, we have to define terminated abstract states. A state
$M^\circ \in S^\circ$ of an IMC $mc^\circ = (S^\circ, P^-, P^+, L, M_0^\circ)$ is $\exists$-terminated iff
$P^+(M^\circ)(M^\circ) = 1$, and is $\forall$-terminated iff $P^-(M^\circ)(M^\circ) = 1$.

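Definition 5.5 (Probabilistic Termination) Let $mc^\circ = (S^\circ, P^-, P^+, L, M_0^\circ)$ be an IMC. The lower and
upper bound of probabilistic termination, starting from $M^\circ \in S^\circ$, are

$Reach^-_{mc^\circ}(M^\circ) = \inf_{\eta \in Adv(mc^\circ)} P^{\eta}_{M^\circ}(\{\pi^\circ \in C(M^\circ) \mid \pi^\circ[i] \text{ is } \forall\text{-terminated for some } i \ge 0\})$
$Reach^+_{mc^\circ}(M^\circ) = \sup_{\eta \in Adv(mc^\circ)} P^{\eta}_{M^\circ}(\{\pi^\circ \in C(M^\circ) \mid \pi^\circ[i] \text{ is } \exists\text{-terminated for some } i \ge 0\})$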

Finally, we observe that the problem of model checking the IMC can be reduced, as in the case of
Interval Markov Chains, to the verification of a Markov Decision Process (MDP), by considering the
so-called feasible solutions. The complexity of this reduction is comparable to that for a standard
Interval Markov Chain with the same number of states. Analogously, more efficient iterative algorithms
that construct a basic feasible solution on-the-fly can also be used to model check our IMC (see [34, 13]).
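
To make the MDP view concrete, the following Python sketch runs a plain value iteration on a small, hypothetical chain. It assumes that every state has finitely many admissible distributions (as happens when all intervals are single points), so the handling of feasible solutions for proper intervals is not covered. Only the four choices of `M0` mirror the running example; the remaining states, the function `termination_bound` and its parameters are invented for illustration.

```python
from fractions import Fraction

HALF, ONE, ZERO = Fraction(1, 2), Fraction(1), Fraction(0)

# Hypothetical finite chain: each state lists its admissible distributions.
# The four choices of M0 mirror the example; the rest is invented.
CHOICES = {
    "M0": [{"M3": HALF, "M5": HALF}, {"M3": HALF, "M4": HALF},
           {"M1": HALF, "M5": HALF}, {"M1": HALF, "M4": HALF}],
    "M1": [{"T": ONE}],
    "M4": [{"T": ONE}],
    "M3": [{"M3": ONE}],
    "M5": [{"M5": ONE}],
    "T": [{"T": ONE}],
}
TERMINATED = {"M3", "M5", "T"}   # states whose only move is a sure self-loop


def termination_bound(choices, terminated, pick, rounds=100):
    """Value iteration from below; pick is min (lower) or max (upper bound)."""
    value = {s: ONE if s in terminated else ZERO for s in choices}
    for _ in range(rounds):
        value = {
            s: ONE if s in terminated else
            pick(sum(p * value[t] for t, p in dist.items())
                 for dist in choices[s])
            for s in choices
        }
    return value


print(termination_bound(CHOICES, TERMINATED, min)["M0"])
print(termination_bound(CHOICES, TERMINATED, max)["M0"])
```

With point intervals the $\forall$-terminated and $\exists$-terminated states coincide, which is why a single set `TERMINATED` is used for both bounds here.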

Soundness and precision of approximations. 

We introduce a notion of best abstraction of a DTMC based on an approximation order on IMC. Here,
for lack of space, we give just an intuitive definition of such an order. The reader can refer to @ for
the formal definition.

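Definition 5.6 (Best Abstraction) We define $\alpha_{mc} : \mathcal{MC} \to \mathcal{MC}^\circ$ such that
$\alpha_{mc}((S, P, L, M_0)) = (\{\alpha(M)\}_{M \in S}, P^-_\alpha, P^+_\alpha, L, \alpha(M_0))$, where
$P^-_\alpha(\alpha(M_1), \alpha(M_2)) = P^+_\alpha(\alpha(M_1), \alpha(M_2)) = P(M_1)(M_2)$.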

The order on IMC is based on a sort of probabilistic simulation. Intuitively, $M_2^\circ$ simulates $M_1^\circ$
($M_1^\circ \preceq_{mc} M_2^\circ$) whenever: (i) $M_2^\circ$ approximates $M_1^\circ$; (ii) each distribution of
$M_1^\circ$ is matched by a corresponding distribution of $M_2^\circ$, where the probabilities of the target
states are possibly summed up.

This simulation provides sufficient conditions for the preservation of extremum probabilities, as 
stated by the following theorem. 

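Theorem 5.7 (Soundness of the order) Let $mc_i^\circ = (S_i^\circ, P_i^-, P_i^+, L_i, M_{0,i}^\circ)$ be two IMC and let
$M_i^\circ \in S_i^\circ$, for $i \in \{1,2\}$. If $M_1^\circ \preceq_{mc} M_2^\circ$, then
$Reach^-_{mc_2^\circ}(M_2^\circ) \le Reach^-_{mc_1^\circ}(M_1^\circ) \le Reach^+_{mc_1^\circ}(M_1^\circ) \le Reach^+_{mc_2^\circ}(M_2^\circ)$.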

6 Derivation of IMC 

We define a systematic method for deriving an IMC from an abstract LTS. The crucial part of the
translation is the calculation of intervals of probabilities from the information reported on abstract
transition labels. The approach proposed in JH follows a methodology similar to the one applied in the
concrete case, based on the calculation of abstract rates, i.e. intervals of rates.

The idea is to derive from abstract transition labels the interval of rates $rate^\circ(t^\circ)$ corresponding
to any abstract transition $t^\circ$. Then, by "summing up" the abstract rates $rate^\circ(t^\circ)$ of all
transitions $t^\circ \in Ts(M_1^\circ, M_2^\circ)$, we obtain the abstract rate $R^\circ(M_1^\circ, M_2^\circ)$ for the
complete move from $M_1^\circ$ to $M_2^\circ$. Analogously, we also obtain the abstract exit rate
$E^\circ(M_1^\circ)$ corresponding to all the moves from $M_1^\circ$. Finally, both lower and upper bounds of the
probability of moving from $M_1^\circ$ to $M_2^\circ$ can be computed by minimizing and maximizing the
solution of $R^\circ(M_1^\circ, M_2^\circ) /^\circ E^\circ(M_1^\circ)$, respectively.
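
A small numerical sketch may help to see why the ratio is minimized and maximized as a whole rather than by dividing two intervals. The Python code below is illustrative only: it assumes a state where X and Y both range over [1,2], that exactly two synchronizations leave the state, and that each has a rate of the form X·Y·r; the value of `r` and the helper `extremes` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def extremes(expr, box):
    """Minimum and maximum of expr over the integer points of the box."""
    names = sorted(box)
    grids = [range(lo, hi + 1) for lo, hi in (box[n] for n in names)]
    values = [expr(**dict(zip(names, point))) for point in product(*grids)]
    return min(values), max(values)


r = Fraction(3)                      # hypothetical basic rate
box = {"X": (1, 2), "Y": (1, 2)}     # X and Y both range over [1,2]


def move_rate(X, Y):
    return X * Y * r                 # one synchronization between X and Y


def exit_rate(X, Y):
    return 2 * X * Y * r             # two such moves leave the state


# Ratio evaluated point by point: numerator and denominator share X and Y.
exact = extremes(lambda X, Y: move_rate(X, Y) / exit_rate(X, Y), box)

# Dividing independently computed intervals forgets that dependency.
(lo_r, hi_r), (lo_e, hi_e) = extremes(move_rate, box), extremes(exit_rate, box)
naive = (lo_r / hi_e, hi_r / lo_e)

print(exact)
print(naive)
```

In this setting the point-wise ratio is constantly 1/2, while the naive interval division yields bounds as loose as 1/8 and 2, the latter not even being a probability.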

However, the refined abstract LTS semantics presents a relevant difference: the labels represent a
notion of conflict between abstract transitions. As an example, Fig. 5(c) reports the abstract transitions
(see also Example 4.6 and Fig. [?]) for the abstract state $M_0^\circ$. Notice that just four combinations of
transitions are possible: (a) (1) and (3); (b) (1) and (4); (c) (2) and (3); (d) (2) and (4). Each
combination $i \in \{(a), \ldots, (d)\}$ leads to a different abstract exit rate for $M_0^\circ$, $E_i^\circ(M_0^\circ)$.
As a consequence, in order to generalize the approach of O, we could minimize and maximize the solution of
$R^\circ(M_0^\circ, M_j^\circ) /^\circ E_i^\circ(M_0^\circ)$, for each combination $i \in \{(a), \ldots, (d)\}$, respectively.

This naive generalization of the approach would be computationally very expensive. Therefore, we
propose a more efficient approximated calculation. The idea is to compute a different exit rate
$E^\circ_{M_2^\circ}(M_1^\circ)$ for $M_1^\circ$, w.r.t. each $M_2^\circ$, reporting the abstract rate of all transitions
which may appear in parallel with a transition of $Ts(M_1^\circ, M_2^\circ)$. This is an approximation of the
exit rates that we would obtain by considering all combinations involving a transition of $Ts(M_1^\circ, M_2^\circ)$.

In the style of @, the abstract rates (intervals of rates) are represented by symbolic expressions on
reagent variables, such as $(e,c)$, where: (i) $e$ is an expression over the reagent variables $\mathcal{X}$;
(ii) $c$ is a set of membership constraints of the form $X \in I$ ¹. This approach exploits more accurately
the information recorded by abstract transition labels. Moreover, for $op \in \{+, /\}$ we use:
(a) $(e_1, c_1)\ op^\circ\ (e_2, c_2) = (e_1\ op\ e_2,\ c_1 \sqcup c_2)$; (b) $(e, c_1) \cup^\circ (e, c_2) = (e,\ c_1 \sqcup c_2)$,
where $c_1 \sqcup c_2 = \bigcup_{X \in Vars(e)} \{X \in \bigsqcup_{(X \in I) \in c_i,\ i \in \{1,2\}} I\}$.
The abstract rate of a transition $t^\circ = M_1^\circ \xrightarrow{\Theta, \Delta^\circ, r} M_2^\circ$ can be defined as follows

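$rate^\circ(t^\circ) = \begin{cases}
(X \cdot r,\ \{X \in I\}) & \Theta = \lambda,\ \lambda \in \mathcal{L}(E.X),\ \Delta^\circ = I, \\
(X \cdot (X-1) \cdot r,\ \{X \in I\}) & \Theta = (\lambda, \mu),\ \Delta^\circ = (I, I),\ \lambda, \mu \in \mathcal{L}(E.X), \\
(X \cdot Y \cdot r,\ \{X \in I_1, Y \in I_2\}) & \Theta = (\lambda, \mu),\ \Delta^\circ = (I_1, I_2),\ \lambda \in \mathcal{L}(E.X),\ \mu \in \mathcal{L}(E.Y),\ X \neq Y.
\end{cases}$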

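Then, we define $E^\circ_{M_2^\circ}(M_1^\circ)$ and $R^\circ(M_1^\circ, M_2^\circ)$, where $TS \subseteq Ts(M_1^\circ)$,

$E^\circ_{M_2^\circ}(M_1^\circ) = \sum^\circ_{(e,c) \in rate(Ts_{\setminus M_2^\circ}(M_1^\circ) \cup Ts(M_1^\circ, M_2^\circ))} (e,c) \qquad R^\circ(M_1^\circ, M_2^\circ) = \sum^\circ_{t^\circ \in Ts(M_1^\circ, M_2^\circ)} rate^\circ(t^\circ)$

$\overline{rate}^\circ(t^\circ) = \begin{cases}
(e,\ c \sqcup \{X \in [0,0] \mid X \in Vars(e)\}) & \text{if } rate^\circ(t^\circ) = (e,c) \text{ and } label(t^\circ) \in label(Ts_{\setminus M_2^\circ}(M_1^\circ)), \\
rate^\circ(t^\circ) & \text{otherwise.}
\end{cases}$

$rate(TS) = \{ r_\Theta \mid \Theta \in \widehat{\mathcal{L}},\ r_\Theta = \bigcup^\circ_{t^\circ \in TS,\ label(t^\circ) = \Theta} \overline{rate}^\circ(t^\circ) \}$

$Ts_{\setminus M_2^\circ}(M_1^\circ) = \{ t^\circ \in Ts(M_1^\circ) \mid target(t^\circ) \neq M_2^\circ,\ label(t^\circ) \text{ not in conflict with } label(Ts(M_1^\circ, M_2^\circ)) \}$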

Here, $Ts_{\setminus M_2^\circ}(M_1^\circ) \subseteq Ts(M_1^\circ)$ reports the transitions which may appear in parallel with
a transition of $Ts(M_1^\circ, M_2^\circ)$. In the calculation of $E^\circ_{M_2^\circ}(M_1^\circ)$ the abstract rates of
transitions with the same label are merged (namely approximated) by taking the union of the membership
constraints.

Finally, both lower and upper bounds of the probability of moving from $M_1^\circ$ to $M_2^\circ$ can be derived
by minimizing and maximizing the solution of $R^\circ(M_1^\circ, M_2^\circ) /^\circ E^\circ_{M_2^\circ}(M_1^\circ)$, respectively.
This reasoning has to be properly combined with two special cases, when $\max(E^\circ_{M_2^\circ}(M_1^\circ)) = 0$
or $\min(E^\circ_{M_2^\circ}(M_1^\circ)) = 0$.

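Definition 6.1 The abstract probabilistic translation function $H^\circ : \mathcal{LTS}^\circ \to \mathcal{MC}^\circ$ is such that
$H^\circ((S^\circ, \to^\circ, M_0^\circ, E)) = (S^\circ, P^-, P^+, L, M_0^\circ)$, where $P^-, P^+ : S^\circ \to SDistr(S^\circ)$ are the
lower and upper probability functions such that, for each $M_1^\circ \in S^\circ$:

a) for each $M_2^\circ \in S^\circ$ such that $\max(E^\circ_{M_2^\circ}(M_1^\circ)) > 0$: if $\min(R^\circ(M_1^\circ, M_2^\circ)) = 0$, then
$P^-(M_1^\circ)(M_2^\circ) = 0$; otherwise, $P^-(M_1^\circ)(M_2^\circ) = \min(R^\circ(M_1^\circ, M_2^\circ) /^\circ E^\circ_{M_2^\circ}(M_1^\circ))$.
Analogously, the $P^+$ function is obtained by replacing, in the previous definition, the min function
with the max function;

b) if, for each $M_2^\circ \in S^\circ$, $\max(E^\circ_{M_2^\circ}(M_1^\circ)) = 0$, then $P^+ = P^-$, $P^+(M_1^\circ)(M_1^\circ) = 1$, and,
$\forall M_2^\circ \neq M_1^\circ$, $P^+(M_1^\circ)(M_2^\circ) = 0$;

c) if $\exists M_2^\circ \in S^\circ$ such that $\max(E^\circ_{M_2^\circ}(M_1^\circ)) > 0$ and $\min(E^\circ_{M_2^\circ}(M_1^\circ)) = 0$, then
$P^+(M_1^\circ)(M_1^\circ) = 1$ and $P^-(M_1^\circ)(M_1^\circ) = 0$.

$L : S^\circ \to (S^\circ \to \wp(\widehat{\mathcal{L}}))$ is a labeling function defined as, $\forall M_1^\circ, M_2^\circ \in S^\circ$,
$L(M_1^\circ, M_2^\circ) = label(\{t^\circ \in Ts(M_1^\circ, M_2^\circ) \mid \max(rate^\circ(t^\circ)) > 0\})$.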
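
The case analysis of Definition 6.1 can be summarized by the following Python sketch. It is one possible reading of cases a)-c), not the authors' implementation: the function `translate_row` and the dictionary keys are hypothetical, and the extreme values of the rates and of their ratio are assumed to be computed elsewhere.

```python
from fractions import Fraction

ZERO, ONE = Fraction(0), Fraction(1)


def translate_row(source, stats):
    """Bounds for the moves leaving `source`.

    stats maps each target to a dict with the extreme values of R, of the
    exit rate E and of the ratio R/E (all assumed to be computed elsewhere).
    """
    targets = set(stats) | {source}
    lower = {t: ZERO for t in targets}
    upper = {t: ZERO for t in targets}

    # Case (b): every exit rate is identically zero, so the state only loops.
    if all(s["max_E"] == 0 for s in stats.values()):
        lower[source] = upper[source] = ONE
        return lower, upper

    # Case (a): targets whose exit rate can be positive.
    for target, s in stats.items():
        if s["max_E"] > 0:
            lower[target] = ZERO if s["min_R"] == 0 else s["min_ratio"]
            upper[target] = ZERO if s["max_R"] == 0 else s["max_ratio"]

    # Case (c): the exit rate may also vanish, so a sure self-loop is possible.
    if any(s["max_E"] > 0 and s["min_E"] == 0 for s in stats.values()):
        lower[source], upper[source] = ZERO, ONE
    return lower, upper


half = Fraction(1, 2)
move = {"min_R": 1, "max_R": 4, "min_E": 2, "max_E": 8,
        "min_ratio": half, "max_ratio": half}
print(translate_row("M0", {"M3": move, "M5": move}))
print(translate_row("M3", {}))
```

A state without outgoing transitions falls under case b) and receives a self-loop with probability 1 in both bounds, which makes it $\forall$-terminated in the sense of Section 5.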

The following theorems state the soundness of our approach. 

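Theorem 6.2 Let $lts_i^\circ = (S_i^\circ, \to_i^\circ, M_{0,i}^\circ, E)$ be two abstract LTS. If $lts_1^\circ \sqsubseteq^\circ_{lts} lts_2^\circ$, then $H^\circ(lts_1^\circ) \sqsubseteq^\circ_{mc} H^\circ(lts_2^\circ)$.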

¹ We require that, $\forall X \in Vars(e)$, there exists exactly one constraint $X \in I$ in $c$.
Figure 6: The IMC



Theorem 6.3 Let E be an environment and $M_0 \in \mathcal{M}$ be a multiset.
We have $\alpha_{mc}(H(LTS((E, M_0)))) \sqsubseteq^\circ_{mc} H^\circ(\alpha_{lts}(LTS((E, M_0))))$.

Example 6.4 Fig. 6 describes the IMC obtained from the abstract LTS of Fig. [?] for the abstract state
$M_0^\circ = \{([1,2],X),([1,2],Y)\}$ (see also the earlier examples).

Note that the result is very precise. For $M_0^\circ$ we derive precisely the approximation discussed in
Fig. 5(b); namely, four admissible distributions corresponding to the combinations of labels not in conflict.
For the other states there is exactly one admissible distribution. In particular, MS, M%, M\ and M£ are 
$\forall$-terminated. By computing lower and upper bounds for probabilistic termination from $M_0^\circ$, we
obtain exactly one in both cases. For the maximum, it is enough to choose the admissible distributions
which reach terminated states as soon as possible. This is represented by the distribution for $M_0^\circ$
that assigns probability 1/2 to the moves to $M_3^\circ$ and to $M_5^\circ$. By contrast, for the minimum, it is
enough to choose the admissible distributions which avoid terminated states whenever this is possible.
This is represented by the distribution for $M_0^\circ$ that assigns probability 1/2 to the moves to
$M_1^\circ$ and to $M_4^\circ$. Thus, we obtain a DTMC, and the reasoning is similar to that discussed in Example 3.4.

This proves that each experiment represented by $M_0^\circ$ leads to a terminated state with probability
one, i.e. universally terminates. Note that here we have examined a very small example for the sake of
simplicity; however, it should be clear that the result could be generalized to any concentration of
reagents X and Y. □

7 Conclusions 

The methodology proposed in this paper is substantially different from most of the approaches proposed
in the literature |[TTl[T3l[T9ll24ll22l[T3l to abstract probabilistic models, which are based on abstract
interpretation or on partitioning of the concrete state space. Our goal is to represent, by means of the
IMC of an abstract system, a set of concrete systems, each corresponding to a different DTMC. In this
setting it is therefore essential to develop an effective method (even for infinite state systems) for
computing the abstract probabilistic model directly from the abstract LTS. The main contribution of the
approach consists in the calculation of the intervals of probabilities from the information reported on
abstract transition labels, without building all the concrete distributions. We have also shown that the
technique of [15] can be successfully generalized to the refined abstract LTS, by finding a good trade-off
between precision and complexity. For this reason, a probabilistic model such as a Markov Decision
Process is not adequate.

An advantage of our framework is that other kinds of uncertainty about biological systems could be
handled in a similar way. For example, the approach could be easily adapted to model (even infinite)
sets of concrete systems with different values for the rates. Another advantage of our framework, being
based on abstract interpretation, is that new analyses could be easily designed by introducing new
abstract LTS semantics. For example, we would like to investigate the application of more precise
numerical domains that are also able to model relational information, such as the domain of convex
polyhedra. We leave to future work the extension of the framework to the full calculus with
communication [28] as well as the extension to Continuous-Time Markov Chains.